Projects: Projects for Investigator
Reference Number BB/K003992/1
Title Multi-relational Association Mining Software for Genome Wide Association Studies
Status Completed
Energy Categories Not Energy Related 20%;
Renewable Energy Sources(Bio-Energy, Applications for heat and electricity) 80%;
Research Types Basic and strategic applied research 100%
Science and Technology Fields BIOLOGICAL AND AGRICULTURAL SCIENCES (Biological Sciences) 100%
UKERC Cross Cutting Characterisation Not Cross-cutting 100%
Principal Investigator Dr A (Amanda) Clare
No email address given
Computer Science
Aberystwyth University
Award Type Research Grant
Funding Source BBSRC
Start Date 01 July 2012
End Date 30 September 2013
Duration 15 months
Total Grant Value £105,519
Industrial Sectors Transport Systems and Vehicles
Region Wales
Programme Tools and Resources Development Fund (TRDF)
 
Investigators Principal Investigator Dr A (Amanda) Clare, Computer Science, Aberystwyth University (99.998%)
  Other Investigator Dr L (Lin) Huang, IBERS, Aberystwyth University (0.001%)
  Other Investigator Dr G (Gancho) Slavov, IBERS, Aberystwyth University (0.001%)
Web Site
Objectives We will produce software for mining associations between genotype and phenotype. This software will benefit industrial and academic researchers who need to associate genotype and phenotype. This includes those who work in genomic selection (crop and animal breeding specialists), those investigating genetic involvement in human diseases and ageing, and those who want to conduct core functional genomics work. They will have access to free, open-source software designed specifically to investigate these relationships.
The Miscanthus research community in particular will benefit from our application of this software to the Miscanthus breeding programme during this project. This bioenergy crop is being investigated to increase our understanding of the factors affecting biomass accumulation and the quality of this biomass for downstream processing, such as combustion in power stations or conversion to liquid fuels, so that we can identify and breed improved Miscanthus varieties. The collection at Aberystwyth exhibits enormous phenotypic and genetic diversity for the development of new varieties optimised for maximum production in varied climates.
Other beneficiaries of this work include schools within the Convergence area of Wales, targeted by the Technocamps activity to help bring computer science to secondary school children. We will produce a bioinformatics 'activity pack' so that the children can learn how computers are used to help biologists analyse their data, and how the information in DNA can produce different phenotypes, as an analogy to how information in computer code can be executed to create different results. Elaine Jensen (a School Regional Champion for the BBSRC) and the BBSRC Inspiring Young Scientists coordinator (Tristan Bunn) will review the bioinformatics activity pack with the intention of publishing it on the BBSRC schoolscience web pages and elibrary in order to make it widely available to teachers and researchers.
Finally, we believe that a good demonstration of multi-relational association mining on this problem will benefit both the data mining community and the GWAS community: the data mining community gains a challenging problem needing new solutions, and the GWAS community gains better exposure to data miners and new algorithms.
Abstract This proposal will produce association mining software for genome-wide association studies. The software will find multi-relational associations; that is, it will be able to work with data expressed as relations spanning multiple database tables, or expressed in first-order predicate logic. In this way we will be able to make use of not just simple marker variations and a basic phenotype, but also complex structured phenotype data, information about parental genotype and phenotype, environmental data, information about sequence similarity, geography, longitudinal data and other data as required.
The software will be based on high performance data structures (inverted indices and data compression) to provide an effective solution for large data that cannot easily be handled by existing algorithms. The software will be open source and documented.
Aberystwyth University has a world-leading breeding programme for the bioenergy crop Miscanthus, with a collection of several thousand accessions. We will apply the software to the Miscanthus case study in Aberystwyth. We are currently obtaining genotype data for these collections, and these data will provide an excellent real-world application for the software.
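The abstract mentions inverted indices as one of the data structures underlying the software. As a rough illustration only, and not the project's actual implementation, the sketch below shows how an inverted index can support association mining: each item (a marker value, a site, a phenotype) maps to the set of accessions that carry it, so the support of an itemset is the size of a set intersection. The function names, item labels and accession ids are all hypothetical.

```python
from collections import defaultdict


def build_inverted_index(records):
    """Map each item to the set of record ids that contain it."""
    index = defaultdict(set)
    for record_id, items in records.items():
        for item in items:
            index[item].add(record_id)
    return index


def support(index, itemset):
    """Count the records that contain every item in `itemset`."""
    postings = [index.get(item, set()) for item in itemset]
    if not postings:
        return 0
    # Start from the shortest posting set to keep intersections small.
    postings.sort(key=len)
    matching = set(postings[0])
    for posting in postings[1:]:
        matching &= posting
    return len(matching)


def confidence(index, antecedent, consequent):
    """Estimate P(consequent | antecedent) from support counts."""
    antecedent_support = support(index, antecedent)
    if antecedent_support == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(index, both) / antecedent_support


# Hypothetical toy data: accession id -> observed items.
records = {
    "acc1": {"marker_A=1", "site=upland", "tall"},
    "acc2": {"marker_A=1", "site=lowland", "tall"},
    "acc3": {"marker_A=0", "site=upland", "short"},
    "acc4": {"marker_A=1", "site=upland", "short"},
}
index = build_inverted_index(records)
rule_confidence = confidence(index, {"marker_A=1"}, {"tall"})
```

In this toy data, three accessions carry `marker_A=1` and two of those are `tall`, so the rule "marker_A=1 implies tall" has a confidence of two thirds. A real system would presumably also compress the posting sets, as the abstract suggests, and span several related tables rather than one flat record set.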
Publications (none)
Final Report (none)
Added to Database 14/04/14